Ginger Cannot Cure Cancer: Battling Fake Health News with a Comprehensive Data Repository
Nowadays, the Internet is a primary source of health information.
A massive amount of fake health news spreading over the Internet has become a
severe threat to public health. Numerous studies have been conducted in the
fake news detection domain; however, few of them are designed to cope
with the challenges of health news. For instance, the development of
explainable methods is required for fake health news detection. To mitigate these
problems, we construct a comprehensive repository, FakeHealth, which includes
news contents with rich features, news reviews with detailed explanations,
social engagements and a user-user social network. Moreover, exploratory
analyses are conducted to understand the characteristics of the datasets,
analyze useful patterns and validate the quality of the datasets for health
fake news detection. We also discuss novel and potential future research
directions for health fake news detection.
ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research
First identified in Wuhan, China, in December 2019, the outbreak of COVID-19
was declared a global emergency in January and a pandemic in March
2020 by the World Health Organization (WHO). Along with this pandemic, we are
also experiencing an "infodemic" of information with low credibility such as
fake news and conspiracies. In this work, we present ReCOVery, a repository
designed and constructed to facilitate research on combating such information
regarding COVID-19. We first broadly search and investigate ~2,000 news
publishers, from which 60 are identified with extreme [high or low] levels of
credibility. By inheriting the credibility of the media on which they were
published, a total of 2,029 news articles on coronavirus, published from
January to May 2020, are collected in the repository, along with 140,820 tweets
that reveal how these news articles have spread on the Twitter social network.
The repository provides multimodal information of news articles on coronavirus,
including textual, visual, temporal, and network information. The way that news
credibility is obtained allows a trade-off between dataset scalability and
label accuracy. Extensive experiments are conducted to present data statistics
and distributions, as well as to provide baseline performances for predicting
news credibility so that future methods can be compared. Our repository is
available at http://coronavirus-fakenews.com.
Comment: Proceedings of the 29th ACM International Conference on Information
and Knowledge Management (CIKM '20)
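The labeling step described in this abstract, where each article inherits the credibility of its publisher and only publishers at the extremes are kept, can be illustrated with a minimal sketch. The publisher names, scores, and cutoffs below are hypothetical and are not taken from ReCOVery itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical publisher scores in [0, 1]; illustrative values only.
PUBLISHER_SCORES = {
    "reliable.example": 0.95,
    "middling.example": 0.55,
    "unreliable.example": 0.10,
}

# Assumed cutoffs for "extreme" credibility; not the paper's criteria.
LOW_CUTOFF = 0.2
HIGH_CUTOFF = 0.9


@dataclass
class Article:
    url: str
    publisher: str


def publisher_label(score: float) -> Optional[int]:
    """Return 1 (reliable), 0 (unreliable), or None for a non-extreme score."""
    if score >= HIGH_CUTOFF:
        return 1
    if score <= LOW_CUTOFF:
        return 0
    return None


def label_articles(articles: List[Article]) -> List[Tuple[Article, int]]:
    """Keep articles from extreme publishers and attach the inherited label."""
    labeled = []
    for article in articles:
        score = PUBLISHER_SCORES.get(article.publisher)
        if score is None:
            continue  # unknown publisher: no label to inherit
        label = publisher_label(score)
        if label is not None:
            labeled.append((article, label))
    return labeled
```

Moving the two cutoffs toward the middle would admit more publishers and therefore more articles, at the cost of noisier inherited labels, which is one way to read the trade-off between scalability and label accuracy mentioned in the abstract.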
A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability
Graph Neural Networks (GNNs) have developed rapidly in recent
years. Due to their great ability in modeling graph-structured data, GNNs are
widely used in various applications, including high-stakes scenarios such as
financial analysis, traffic prediction, and drug discovery. Despite their
great potential to benefit humans in the real world, recent studies show that
GNNs can leak private information, are vulnerable to adversarial attacks, can
inherit and magnify societal bias from training data, and lack interpretability,
all of which risk causing unintentional harm to users and society. For
example, existing works demonstrate that attackers can fool GNNs into giving
the outcome they desire with unnoticeable perturbations on the training graph. GNNs
trained on social networks may embed discrimination in their decision
process, strengthening undesirable societal bias. Consequently, trustworthy
GNNs in various aspects are emerging to prevent the harm from GNN models and
increase the users' trust in GNNs. In this paper, we give a comprehensive
survey of GNNs in the computational aspects of privacy, robustness, fairness,
and explainability. For each aspect, we give the taxonomy of the related
methods and formulate the general frameworks for the multiple categories of
trustworthy GNNs. We also discuss the future research directions of each aspect
and connections between these aspects to help achieve trustworthiness.
Graph-Augmented Normalizing Flows for Anomaly Detection of Multiple Time Series
Anomaly detection is a widely studied task for a broad variety of data types;
among them, multiple time series appear frequently in applications such as
power grids and traffic networks. Detecting anomalies for multiple
time series, however, is a challenging subject, owing to the intricate
interdependencies among the constituent series. We hypothesize that anomalies
occur in low density regions of a distribution and explore the use of
normalizing flows for unsupervised anomaly detection, because of their superior
quality in density estimation. Moreover, we propose a novel flow model by
imposing a Bayesian network among constituent series. A Bayesian network is a
directed acyclic graph (DAG) that models causal relationships; it factorizes
the joint probability of the series into the product of easy-to-evaluate
conditional probabilities. We call such a graph-augmented normalizing flow
approach GANF and propose joint estimation of the DAG with flow parameters. We
conduct extensive experiments on real-world datasets and demonstrate the
effectiveness of GANF for density estimation, anomaly detection, and
identification of time series distribution drift.
Comment: ICLR 2022. Code is available at https://github.com/EnyanDai/GAN
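The factorization mentioned in this abstract, where a DAG splits the joint density into a product of per-series conditionals and low density flags an anomaly, can be sketched as follows. This is not GANF: the conditionals here are simple linear-Gaussian models standing in for normalizing flows, the sketch scores a single time step rather than a window, and the DAG, weights, and threshold are all hypothetical.

```python
import math

# Hypothetical DAG over three series: PARENTS[i] lists the parents of series i.
PARENTS = {0: [], 1: [0], 2: [0, 1]}

# Assumed linear-Gaussian conditionals (a stand-in for flows):
# x_i ~ Normal(BIAS[i] + sum_j WEIGHTS[i][j] * x_j, SIGMA[i] ** 2)
WEIGHTS = {0: {}, 1: {0: 0.8}, 2: {0: 0.5, 1: -0.3}}
BIAS = {0: 0.0, 1: 0.0, 2: 0.0}
SIGMA = {0: 1.0, 1: 0.5, 2: 0.5}


def gaussian_log_pdf(x, mean, sigma):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def joint_log_density(x):
    """log p(x) = sum over i of log p(x_i | parents of x_i)."""
    total = 0.0
    for i, parents in PARENTS.items():
        mean = BIAS[i] + sum(WEIGHTS[i][j] * x[j] for j in parents)
        total += gaussian_log_pdf(x[i], mean, SIGMA[i])
    return total


def is_anomaly(x, threshold):
    """Flag an observation whose joint log density falls below the threshold."""
    return joint_log_density(x) < threshold
```

Because each term depends only on a series and its parents, an observation can be unremarkable for every series taken alone and still receive a low joint density when it violates a dependency encoded in the graph.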
Towards Robust Graph Neural Networks for Noisy Graphs with Sparse Labels
Graph Neural Networks (GNNs) have shown great ability in modeling
graph-structured data. However, real-world graphs usually contain structural noise
and have limited labeled nodes. The performance of GNNs drops
significantly when trained on such graphs, which hinders the adoption of GNNs
in many applications. Thus, it is important to develop noise-resistant GNNs
that work with limited labeled nodes. However, work on this problem is rather limited.
Therefore, we study a novel problem of developing robust GNNs on noisy graphs
with limited labeled nodes. Our analysis shows that both the noisy edges and
limited labeled nodes could harm the message-passing mechanism of GNNs. To
mitigate these issues, we propose a novel framework which adopts the noisy
edges as supervision to learn a denoised and dense graph, which can down-weight
or eliminate noisy edges and facilitate message passing of GNNs to alleviate
the issue of limited labeled nodes. The generated edges are further used to
regularize the predictions of unlabeled nodes with label smoothness to better
train GNNs. Experimental results on real-world datasets demonstrate the
robustness of the proposed framework on noisy graphs with limited labeled
nodes.
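Two ideas from this abstract, down-weighting or eliminating noisy edges and regularizing predictions with label smoothness, can be illustrated with a minimal sketch. It is not the proposed framework: the paper learns the denoised graph using the noisy edges as supervision, whereas this sketch substitutes cosine similarity of node features as the edge weight. The function names and the `floor` parameter are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def downweight_edges(edges, features, floor=0.1):
    """Weight each edge by feature similarity; drop edges below `floor`."""
    kept = []
    for u, v in edges:
        weight = cosine_similarity(features[u], features[v])
        if weight >= floor:
            kept.append((u, v, weight))
    return kept


def label_smoothness(weighted_edges, predictions):
    """Sum of weight * squared distance between predictions of linked nodes."""
    total = 0.0
    for u, v, weight in weighted_edges:
        diff = sum((p - q) ** 2 for p, q in zip(predictions[u], predictions[v]))
        total += weight * diff
    return total
```

Adding the smoothness term to a training loss penalizes linked nodes that receive different predicted class distributions, so edges with higher weights pull the predictions of their endpoints closer together.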